The primary ciliary dyskinesia-related genetic risk score is associated with susceptibility to adult-onset asthma

Background Disturbance of mucociliary clearance is an important factor in the pathogenesis of asthma. We hypothesized that common variants in genes responsible for ciliary function may contribute to the development of asthma with certain phenotypes. Methods Three independent adult Japanese populations (including a total of 1,158 patients with asthma and 2,203 non-asthmatic healthy participants) were studied. First, based on the ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/), we selected 12 common single-nucleotide polymorphisms (SNPs) with molecular consequences (missense, nonsense, and 3’-untranslated region mutation) in 5 primary ciliary dyskinesia (PCD)-related genes and calculated a PCD-genetic risk score (GRS) as a cumulative effect of these PCD-related genes. Second, we performed a two-step cluster analysis using 3 variables, including PCD-GRS, forced expiratory volume in 1 second (%predicted FEV1), and age of asthma onset. Results Compared to adult asthma clusters with an average PCD-GRS, clusters with high and low PCD-GRS had similar overall characteristics: adult-onset, female predominance, preserved lung function, and fewer features of type 2 immunity as determined by IgE reactivity and blood eosinophil counts. The allele frequency of rs1530496, a SNP representing an expression quantitative trait locus (eQTL) of DNAH5 in the lung, showed the largest statistically significant difference between the PCD-GRS-High and PCD-GRS-Low asthma clusters (p = 1.4 x 10−15). Conclusion Genes associated with PCD, particularly the common SNPs associated with abnormal expression of DNAH5, may have a certain influence on the development of adult-onset asthma, perhaps through impaired mucociliary clearance.


Introduction
Asthma is a chronic obstructive airway disease with multiple phenotypes that may differ in disease pathobiology; the disease is caused by complex gene-environmental interactions.Several genome-wide association studies (GWASs) of asthma have been conducted to date [1,2], and some GWASs have identified novel candidate genes by focusing on specific phenotypes of asthma [3,4].By focusing on the adult-onset phenotype, we previously reported that HCG22, a member of a gene cluster encoding mucin-like proteins that is located on chromosome 6p21.3and is also known as a candidate gene for diffuse panbronchiolitis (DPB), is associated with asthma [5].DPB is a chronic obstructive airway disease characterized by neutrophilic bronchiolitis and mucus hypersecretion leading to impaired mucociliary clearance.
Importantly, impaired mucociliary clearance is broadly implicated in asthma pathogenesis, with or without a type 2 phenotype, and dysfunction of cilia may be in part responsible for poor mucus and/or allergen clearance from the airways [6].Impaired mucociliary function caused by allergic immune response also weakens the barrier integrity and self-cleaning abilities of the airway epithelium making it more vulnerable to penetration of allergens as well as of infection by bacteria and viruses [7].In a previous candidate gene analysis, KIF3A, a member of the kinesin superfamily of microtubule-associated motors that are important in the transport of protein complexes within cilia, was identified as a novel gene for childhood asthma [8].
Primary ciliary dyskinesia (PCD) is another airway disease characterized by impaired mucociliary clearance and airway neutrophilic inflammation caused by structural and/or functional abnormalities of cilia.A mutation in any one of the proteins required to build or regulate cilia can cause this disease; more than 40 related genes have been reported to date [9][10][11].A previous study analyzing bronchial airway epithelial cell gene expression in relation to fractional exhaled nitric oxide (FeNO) identified a distinct asthma phenotype mainly involving cilia-related genes [12].
In the present study, given the complexity and heterogeneity of asthma, we conducted candidate gene analysis of adult asthma by hypothesizing that common functional variants at multiple PCD-related genes, rather than a rare mutation at a single gene, contribute to the susceptibility to specific asthma phenotypes in which ciliary dysfunction could play an important role in the disease pathogenesis.

Results
The baseline characteristics of each of the 3 cohorts are shown in S1 Table .The median age of onset for patients with asthma in the 3 cohorts was approximately 40 years.Patients with asthma in Tsukuba Cohort 2 were characterized by fewer smokers (18.1%), and patients with asthma in Hokkaido Cohort were characterized by a lower prevalence of atopy (55.3%) and a higher prevalence of eosinophilic asthma (71.4%) compared to patients with asthma in the other cohorts.There was no significant association between each of the 12 single-nucleotide polymorphisms (SNPs) and the prevalence of asthma in any population (S2 Table ).
The two-step cluster analysis of patients with asthma in Tsukuba Cohort 1 identified 4 clusters, which were designated Clusters T1-A to -D in the order of decreasing PCD-genetic risk score (GRS) values.Patients with asthma in Tsukuba Cohort 2 and in Hokkaido Cohort were classified into 5 distinct clusters designated Clusters T2-A to -E and Clusters H-A to -E, respectively.The clinical characteristics of the identified asthma clusters in each cohort are summarized in Tables 1a-1c and 2.
The PCD-GRS of Cluster T1-A, Clusters T2-A and -B, and Cluster H-A were significantly higher than those of healthy participants in each population.On the other hand, the PCD-GRS of Cluster T1-D, Clusters T2-D and -E, and Clusters H-D and -E were significantly lower than those of healthy participants (Table 3).
To identify the genes or SNPs that contributed most strongly to the development of these particular clusters of asthma, the associations between each of 12 SNPs and asthma clusters with a higher PCD-GRS or with a lower PCD-GRS were examined in each cohort (Table 4).The SNPs, particularly those in DNAH5, were consistently associated with both asthma clusters with a higher GRS and a lower GRS, but the alleles associated with asthma were the opposite.When a genome-wide meta-analysis was conducted to clarify the genetic difference between asthma clusters with a higher PCD-GRS and clusters with a lower PCD-GRS using a combination of the two Tsukuba cohorts (Tsukuba Cohorts 1 and 2), the DNAH5 gene again showed the strongest significant association with the difference in allele frequency between these two asthma groups (Fig 1).Further association analysis between asthma clusters with a higher PCD-GRS and clusters with a lower PCD-GRS in the regions within 100 kb of DNAH5 (i.e., DNAH5 ±100 kb), where SNP density was increased by imputation, showed that most of the significant SNPs were expression quantitative trait loci (eQTLs) of DNAH5 in the lung (Table 5).
To identify the clinical characteristics of asthma clusters with high or low PCD-GRS values in greater detail, the asthma clusters with a higher PCD-GRS were combined as "PCD-GRS-Hi asthma", and the asthma clusters with a lower PCD-GRS were combined as "PCD-GRS-Lo asthma", and these groups were then compared to the combined remaining clusters designated as "PCD-GRS-average asthma" (Table 6).Both PCD-GRS-Hi and GRS-Lo asthma showed similar characteristics of late-onset, female-dominant, and relatively preserved lung function compared to PCD-GRS-average asthma.In addition, the prevalence of eosinophilic asthma was lowest in PCD-GRS-Hi asthma and the prevalence of atopic asthma was lowest in PCD-GRS-Lo asthma.
Because smoking has a profound effect on mucociliary function, an analysis using only nonsmokers was performed.The results did not differ from the analysis of all patients (S3 Table ).

Discussion
Stratifying asthma phenotypes using cluster analysis allowed us to successfully identify the genetic effects of PCD-related genes, especially of DNAH5, on asthma characterized by adultonset, female predominance, preserved lung function, and decreased frequencies of atopic and eosinophilic asthma.Furthermore, given that both wild-type and variant alleles of DNAH5 correlated positively with the similar phenotypes, we believe that the focus on specific asthma phenotypes by cluster analysis and the simultaneous examination of the effects of multiple SNPs using GRS helped us identify the genetic effects of the common SNPs at DNAH5 on adult-onset asthma.In fact, previous simple GWASs of asthma have not reported any genetic association with the DNAH5 region [13].
In a recent genetic association analysis with atopy, a significant interaction of the DNAH5 gene was reported with ADGRV1, a causative gene of Usher syndrome type 2, a disease characterized by congenital deafness and retinitis pigmentosa [14].Both genotypes of DNAH5 (rs2134256 TT or CC) were reported to have a positive effect on atopy, depending on the genotypes of ADGRV1 (rs17554723 AA or GG).Therefore, in the present study, the reason why both wild-type and variant alleles in DNAH5 were positively associated with similar adult- onset asthma phenotypes may also be due to the interaction with other genes or endotypes, although significant interactions between DNAH5 rs1530496 and the other 8 SNPs on different chromosomes examined in this study were not identified.In any case, even if the DNAH5 gene is primarily involved in the endotypes or molecular mechanisms associated with predisposition to mucosal ciliary dysfunction and non-type 2 airway inflammation, it is clear that the genetic effects of DNAH5 are not strong enough to cause the development of asthma with any specific phenotypes on its own.It is rather reasonable to assume that the DNAH5 gene exerts some influence on the expression of certain asthma phenotypes as a result of its interaction with other genetic and environmental factors.
According to the GTEx portal database [15] (https://gtexportal.org/home/), the alleles that increase the expression of DNAH5 mRNA in the lung are consistently associated with asthma with a lower PCD-GRS; the alleles that decreased the expression of DNAH5 mRNA in the lung are consistently associated with asthma with a higher PCD-GRS.Although the functional consequences of aberrant expression of the DNAH5 gene are not clear, a structural abnormality For each cohort, the maximum OR is graded as red, the minimum as blue, and the median as white.*Chi-squared test was used to compare allele frequencies between adult patients with and without asthma.Clusters were designated A to E in order of decreasing PCD-GRS.PCD-GRS, primary ciliary dyskinesia-genetic risk score; SNP, single nucleotide polymorphism; OR, odds ratio. https://doi.org/10.1371/journal.pone.0300000.t004 with an outer dynein arm (ODA) defect is the major finding in PCD, accounting for nearly 50% of cases [16].DNAH5 encodes the axonal heavy dynein chain of ODA and is responsible for ODA deficiency, and mutations at this gene accounted for the largest proportion (36.3%) of PCD cases in the study of Boaretto et al [17].In the Japanese population, mutations in DNAH5 have been found in 31% of PCD cases, second only to mutations in CCDC164, which have been seen in 49% of cases [18].In PCD patients with DNAH5 mutations, electron microscopic findings have shown two patterns of localization changes: complete loss of DNAH5 in ciliary axonemes with accumulation at the ciliary base; or DNAH5 distally absent only in the ciliary axonemes.The cilia in these patients were immotile or showed altered motility on highspeed video microscopy [19,20].
The phenotypes with lower prevalence of atopy or with lower eosinophil counts that were identified in the present study may reflect a causal link between aberrant expression of DNAH5 and the presence of antigen non-specific type-2 or neutrophilic airway inflammation (Table 6).When the associations of PCD-GRS with clinical information potentially related to abnormal mucociliary motility, such as exacerbations requiring antibiotics, long-term use of macrolides, and peripheral blood neutrophil counts, were examined in patients who had reliable information on these phenotypes, no clear associations were found, although the small -A, T2-A and -B) and a low PCD-GRS (T1-D, T2-D and -E  number of eligible patients precluded any firm conclusions (S4 Table ).Some patients with non-type 2 asthma, characterized by chronic productive cough and episodic infectious exacerbations, may be included in the recently proposed disease category of muco-obstructive lung disease, which includes COPD, PCD and bronchiectasis [21].Patients in this disease category often suffer from persistent bacterial infections caused by Haemophilus influenzae, Streptococcus pneumoniae, Moraxella catarrhalis, Klebsiella pneumoniae or Pseudomonas aeruginosa.A complex interaction between these chronic bronchial infections and intrinsic ciliary abnormalities may underlie the etiology of this disease category.Nevertheless, given that the frequency of patients with a high or low PCD-GRS that was associated with adult-onset asthma was found to be more than 60% of all asthmatics, it is likely that the endotype associated with the PCD-GRS more broadly underlies a variety of phenotypes with or without type-2 features, rather than just defining a specific phenotype associated with PCD-like clinical features.

Table 5. Meta-analysis of association studies comparing asthma clusters with a high PCD-GRS (T1
There are some limitations to this study.First, it is unclear how the higher or lower expression of DNAH5 affects ciliary function.Second, the number of SNPs examined in the current study was very limited, given that we targeted only SNPs that were present either in the

Group1: PCD-GRS-Hi asthma
Group2: PCD-GRS-Lo asthma Group3: PCD-GRS-average asthma -E,  H-D, -E PCD-GRS-Hi: combined clusters with higher PCD-GRS; PCD-GRS-Lo: combined clusters with lower PCD-GRS.*Atopy was defined as a positive response (>1.0 lumicount) to at least one of the 14 inhaled allergens.† Eosinophilic asthma was defined as a peripheral blood eosinophil count of more than 300 /μL or more than 5%.‡ For PCD-GRS, difference was found between all groups.For age, onset age, %predicted FEV 1 , FEV 1 /FVC, and total serum IgE, difference was found between GRS-Hi group and the -average group, and between GRS-Lo group and the -average group.For categorical covariates, adjusted residuals are shown in the table.The PCD-GRS of Group 1 or Group 2 was significantly different from that of 2203 healthy individuals (p = 2.2 x 10 −6 and p = 2.2 x 10 −6 respectively).FEV 1 , forced expiratory volume in 1 second; FVC, forced vital capacity; PCD, primary ciliary dyskinesia; GRS, genetic risk score.

P value ‡ (Combined group of T1-A, T2-A, -B, H-A) (Combined group of T1-D, T2-D,
https://doi.org/10.1371/journal.pone.0300000.t006 Infinium Asian Screening Array-24 v1.0 BeadChip [22] or the Illumina HumanHap550v3/ 610-Quad BeadChip [23].Third, data were insufficient to statistically evaluate the clinical features potentially related to mucociliary dysfunction, such as the number of exacerbations requiring antibiotics, the presence of long-term macrolide use, and peripheral blood neutrophil counts.Fourth, patients with asthma COPD overlap (so-called ACOs) were not excluded.However, it is unlikely that PCD-GRS is particularly associated with COPD pathogenesis, given that asthmatics who were found to have high or low PCD-GRS levels had no apparent impaired lung function, and excluding smokers from the analysis did not significantly change the results.Finally, although DNAH5 was significantly associated with the clusters identified in this study, the genetic backgrounds for adult-onset asthma cannot be fully explained by DNAH5; further validation, including the gene-gene and gene-environment interactions, will be necessary.
In conclusion, the current study suggested that impaired mucociliary clearance due to genetic susceptibility to ciliary dysfunction may underlie the pathogenesis of adult-onset asthma, which may give some insight into the complex interplay of several distinct endotypes underlying the extremely diverse clinical phenotypes of asthma patients.

Study population
We studied 3 independent adult Japanese populations (S1 Table ).The first population (Tsukuba Cohort 1) included 565 healthy participants and 537 patients with asthma.The second population (Tsukuba Cohort 2) included 965 healthy participants and 242 patients with asthma.Patients with asthma in these 2 cohorts were recruited from the University of Tsukuba Hospital and its affiliated hospitals, and healthy participants were recruited from the health checkup center of the Tsukuba Medical Center Hospital from June 1, 2008 through March 31, 2022 [24].The third population (Hokkaido Cohort) included 673 healthy participants and 446 patients with asthma, all of whom were recruited from the Hokkaido University Hospital [2].
In this study, asthma was defined as having recurrent episodes of 2 or more of 3 specific symptoms (coughing, wheezing, and dyspnea) known to be associated with reversible airflow obstruction and/or increased airway responsiveness [25].The presence of eosinophilic inflammation as indicated by peripheral blood eosinophilia and exhaled NO and the presence of atopy were also used as diagnostic references.Smokers were not excluded.
The age of onset of asthma was confirmed by interview, including episodes of dyspnea, wheezing, or coughing experienced during childhood and adolescence.In cases of uncertainty, the age of the earliest respiratory symptom was considered the age of onset.Individuals with a history of asthma or chronic obstructive disease were excluded from the healthy participants.genotyping was carried out using Infinium Asian Screening Array-24 v1.0 BeadChip (Illumina, San Diego, CA, USA) for individuals in Tsukuba Cohort 1 [22], and Illumina Human-Hap550v3/610-Quad BeadChip (Illumina) for individuals in Tsukuba Cohort 2 [23].For individuals in the Hokkaido Cohort, genotyping of the selected SNPs was performed using the TaqMan allele-specific amplification method (Applied Biosystems, Foster City, CA, USA) [26].

Gene and SNP selection
Although over 40 disease-causing genes have been described to date, this study focused on 29 functionally well-characterized causative genes of PCD (S5 Table) [6].Using the freely accessible ClinVar database (https://www.ncbi.nlm.nih.gov/clinvar/)[27], a public archive of reports of the relationships among human variations and phenotypes with supporting evidence, we searched for SNPs that have been reported to cause molecular consequences in the 29 PCDrelated genes, and we identified a total of 2,694 SNPs as of April 7, 2020.After exclusion of rare SNPs with minor allele frequencies (MAFs) < 0.1 in the Japanese population, SNPs were cross-referenced with our two sets of genome-wide genotyping SNP data (Tsukuba Cohort 1 and 2); a total of 29 SNPs at 14 genes remained.Since the genotyping platforms differed between Tsukuba Cohorts 1 and 2, genotype imputation in the 14 genes ± 100 kb regions of both cohorts was performed using the other cohort's SNP data as the reference panel.We used a two-step imputation approach.First, imputation with pre-phasing of the target dataset was performed.The haplotypes for each individual were estimated using MACH software (prephasing).Then, the genotype imputation with pre-phased haplotypes in the MACH framework was performed by Minimac3 [22].After excluding SNPs with low estimated imputation accuracies (Minimac r 2 < 0.3) or SNPs in tight linkage disequilibrium (LD) (r 2 > 0.5), 12 SNPs at 5 genes remained and were used for the further analysis (Fig 2, S2 Table ).

Calculating the multi-SNP genetic risk score (GRS)
We calculated the GRS of PCD-related genes using nonsense and missense mutations that cause protein loss or structural changes and mutations in the 3'-untranslated region (UTR), which is involved in mRNA stability and translation efficiency.Since pathogenic mutations of PCD are extremely rare, we focused on common SNPs that have not been reported to be involved in PCD pathogenesis, but may cause some functional effects on PCD-related genes.We then used PCD-GRS as a stratifying factor for cluster analysis.
As a measure of the cumulative effect of genes responsible for ciliary function, we calculated the PCD-GRS as a sum of the number of alleles causing molecular consequences at 12 loci for all individuals.Individual PCD-GRS values were calculated using the formula below:

Statistical analysis
To identify asthma phenotypes influenced by PCD-related genes, we conducted a two-step cluster analysis of patients with asthma using 3 variables (PCD-GRS, forced expiratory volume in 1 second (%predicted FEV 1 ), and age of onset) and IBM SPSS Statistics 26.According to our previous cluster analysis of adult asthma in the Japanese population, age at onset and % predicted FEV 1 were the most important discriminators for asthma [28].Indeed, several other cluster analyses have also identified 3 to 5 asthma phenotypes, with similarities among asthma phenotypes according to age of onset and lung function [29,30].Therefore, in the present study, we intentionally selected these variables in addition to the PCD-GRS.The similarity between each cluster was measured using log likelihood, and the optimal number of clusters was defined using Bayesian Information Criterion (BIC) values (https://www.ibm.com/docs/en/spss-statistics/24.0.0?topic=base-twostep-cluster-analysis).PCD-GRS values of identified clusters and healthy participants in each population were compared by two-tailed one-way analysis of variance (ANOVA) followed by Tukey's post hoc analysis.
To clarify the genetic influence of each gene or SNP on the asthma clusters with high PCD-GRS and low PCD-GRS, we conducted a GWAS in the Tsukuba Cohorts 1 and 2, followed by meta-analysis.Quality control of genotyping data was conducted under the condition of MAF � 1%, Hardy-Weinberg equilibrium test p � 1.0 x 10 −6 , sample call rate > 90%, and genotyping call rate > 90% using PLINK1.90[31].The numbers of SNPs were as follows: Tsukuba Cohort 1, 466,404; Tsukuba Cohort 2, 480,188; and meta-analysis, 66,993.Since DNAH5 was the gene with the strongest statistically significant association on meta-analysis, we further searched for SNPs with a significant difference in allele frequencies using SNP data within the DNAH5 ±100 kb region, where SNP density was increased by imputation.
We used a generalized linear model to examine the gene-gene interactions between DNAH5 rs1530496 and the other 8 SNPs on different chromosomes (rs12623642, rs3795958, rs2285943, rs10224537, rs2214326, rs7971, rs10250905, and rs600753) that were used to calculate the PCD-GRS.

Fig 1 .
Fig 1. Manhattan plot of the meta-analysis of genome-wide association studies comparing asthma clusters with a high PCD-GRS (T1-A, T2-A and -B) and a low PCD-GRS (T1-D, T2-D and -E) using Tsukuba Cohorts 1 and 2 (using asthma clusters with a high PCD-GRS as a reference).The Y-axis shows the-log10 P values, and the X-axis shows the SNP chromosomal positions.The position of the DNAH5 SNP is indicated by the arrow at the top of the plot.PCD-GRS, primary ciliary dyskinesia-genetic risk score; Chr, chromosome; SNP, single nucleotide polymorphism.https://doi.org/10.1371/journal.pone.0300000.g001

Table 1 . a. Characteristics
of asthma clusters identified in Tsukuba Cohort 1. b. Characteristics of asthma clusters identified in Tsukuba Cohort 2. c.Characteristics of asthma clusters identified in the Hokkaido Cohort.

Table 1 .
(Continued) *Smoking index was calculated for current and past smokers by multiplying smoking dose (cigarettes per day) and duration (years smoked).† Atopy was defined as a positive response (>1.0 lumicount) to at least one of the 14 inhaled allergens.‡ Eosinophilic asthma was defined as a peripheral blood eosinophil count of more than 300/μL or 5%.§ A two-tailed one-way analysis of variance (ANOVA), a Kruskal-Wallis test, or a chi-squared test was used to compare parametric continuous, nonparametric continuous, or categorical covariates, respectively.BMI, body mass index; FEV 1 , forced expiratory volume in 1 second; FVC, forced vital capacity; PCD, primary ciliary dyskinesia; GRS, genetic risk score.https://doi.org/10.1371/journal.pone.0300000.t001

Table 2 . Summary of clinical phenotypes of the identified asthma clusters.
Clusters were designated A to E in order of decreasing GRS.GRS, genetic risk score.https://doi.org/10.1371/journal.pone.0300000.t002

Table 3 . Tukey's HSD post hoc analysis of the PCD-GRS between healthy participants and asthma clusters in each cohort.
Clusters were designated A to E in order of decreasing PCD-GRS.PCD-GRS, primary ciliary dyskinesia-genetic risk score; SD, standard deviation.https://doi.org/10.1371/journal.pone.0300000.t003